Detect KISA copyright holder with parenthesized expansion#5133
Conversation
Signed-off-by: Vincent Gao <gaobing1230@gmail.com>
Signed-off-by: Vincent Gao <gaobing1230@gmail.com>
a8ee21a to
5eea338
Compare
Signed-off-by: Vincent Gao <gaobing1230@gmail.com>
5f6adda to
5b07d79
Compare
|
Thanks @gaoflow, taking a look soon, we probably also have to check where/why the regression was introduced at #5125 (comment) |
| (r"^[a-z].+\(s\)[\.,]?$", 'JUNK'), | ||
|
|
||
| # KISA with an opening parenthesized expansion, as in "KISA(Korea" | ||
| (r"^KISA\(Korea$", 'NNP'), |
There was a problem hiding this comment.
This is super specific to this failing case, and I'm not sure that this addition specific to korea is the best fix. There could be a larger issue which is casing this failure, which needs to identified and fixed.
We need to figure out where and why this regression happened.
@tsteenbe also pointed out the issue and that there are more regressions, so he'll be providing more examples, which would help figure out the root cause possibly.
|
On where the regression entered: the version table in #5125 bisects cleanly to the 31.2.6 → 32.0 boundary — The common factor in the failing case is a name token immediately followed by a parenthesized expansion with no space ( |
|
@gaoflow I can appreciate your usage of LLMs, but please refrain from using them to post comments in PRs and issues. We are bringing up a new org-wide policy to this effect. With that said, a simple bisect tells exactly where and when the regression was introduced, no LLM needed. Please see: A fix should read into that change. And I would need to see more examples |
| @@ -0,0 +1 @@ | |||
| Copyright (c) 2007 KISA(Korea Information Security Agency). | |||
There was a problem hiding this comment.
Can you tell where that's coming from exactly? The original issue has a different text line linked by @fviernau at:
https://github.com/openssl/openssl/blob/636dfadc70ce26f2473870570bfd9ec352806b1d/crypto/seed/seed_local.h#L11
Fixes #5125.
Summary
KISA(Korea, as a proper-name token before the broader middle-parenthesis JUNK rule.Copyright (c) 2007 KISA(Korea Information Security Agency).AUTHORS.rstas requested by the contribution guide.Tests
.venv/bin/python -m py_compile src/cluecode/copyrights.py tests/cluecode/test_copyrights.pygit diff --checkdetect_copyrights_from_lines()check for the KISA sample, expecting both the full copyright and holder.pytest tests/cluecode/test_copyrights.py -k kisa_seed_local --test-suite all -qpasses locally when using a pure-textnumbered_text_linesshim; the full local ScanCode test environment is blocked by missinglibmagicon this macOS arm64 setup.AI assistance was used under my direction.